Differentiable Architecture Search (DARTS) has attracted considerable attention as a gradient-based Neural Architecture Search (NAS) method. Since the introduction of DARTS, little work has been done on adapting the action space based on state-of-the-art architecture design principles for CNNs. In this work, we aim to address this gap by incrementally augmenting the DARTS search space with micro-design changes inspired by ConvNeXt and studying the trade-off between accuracy, evaluation layer count, and computational cost. To this end, we introduce the Pseudo-Inverted Bottleneck conv block, which is intended to reduce the computational footprint of the inverted bottleneck block proposed in ConvNeXt. Our proposed architecture is much less sensitive to evaluation layer count and significantly outperforms a DARTS network of similar size at layer counts as small as 2. Furthermore, with fewer layers, it not only achieves higher accuracy with lower GMACs and parameter count, but GradCAM comparisons also show that our network detects distinctive features of target objects better than DARTS.
translated by Google Translate
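The field of computer vision is evolving rapidly, particularly in the context of new methods of neural architecture design. These models contribute to (1) the climate crisis, through increased CO2 emissions, and (2) the privacy crisis, through data leakage concerns. To address the often-overlooked impact that the computer vision (CV) community has on these crises, we outline a novel ethical framework, \textit{P4AI}: Principles for AI, an augmented principlistic view of ethical dilemmas within AI. We then suggest using P4AI to make concrete recommendations to the community to mitigate the climate and privacy crises.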
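Climate change remains a pressing issue that currently affects society at large. It is important that we as a society, including the computer vision (CV) community, take steps to limit our impact on the environment. In this paper, we (a) analyze the effect of diminishing returns on CV methods, and (b) propose \textit{``NoFADE''}: a novel entropy-based metric to quantify model-dataset-complexity relationships. We show that some CV tasks are reaching saturation, while others are almost fully saturated. In this light, NoFADE allows the CV community to compare models and datasets on a similar basis, establishing an agnostic platform.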
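The task of hyper-parameter optimization (HPO) is burdened with heavy computational costs due to the intractability of optimizing a model's weights and its hyper-parameters simultaneously. In this work, we introduce a new class of HPO method and explore how the low-rank factorization of the convolutional weights of intermediate layers of a convolutional neural network can be used to define an analytical response surface for optimizing hyper-parameters, using only training data. We quantify how this surface behaves as a surrogate for model performance and show that it can be solved using a trust-region search algorithm, which we call autoHyper. The algorithm outperforms state-of-the-art methods such as Bayesian optimization and generalizes across model, optimizer, and dataset selection. The PyTorch code can be found at \url{https://github.com/mathieutuli/autohoyper}.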
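Federated learning (FL) is a decentralized approach that enables hospitals to collaboratively learn a model without sharing private patient data for training. In FL, participant hospitals periodically exchange training results, rather than training samples, with a central server. However, access to model parameters or gradients can expose private training data samples. To address this challenge, we adopt secure multiparty computation (SMC) to establish a privacy-preserving federated learning framework. In our proposed method, the hospitals are divided into clusters. After local training, each hospital splits its model weights among the other hospitals in the same cluster, so that no single hospital can retrieve another hospital's weights on its own. Then, all hospitals sum the weights they received and send the results to the central server. Finally, the central server aggregates the results, retrieving the average of the models' weights and updating the model without having access to any individual hospital's weights. We conduct experiments on a publicly available repository, The Cancer Genome Atlas (TCGA). We compare the performance of the proposed framework with differential privacy, using federated averaging as the baseline. The results show that, compared with differential privacy, our framework achieves higher accuracy with no privacy leakage risk, at the cost of higher communication overhead.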
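The split-sum-aggregate steps described in this abstract can be illustrated with a minimal sketch of additive secret sharing. Everything here is an assumption made for illustration: the function names are hypothetical, only a single cluster is modeled, each hospital keeps one share of its own weights, and shares are plain floating-point numbers rather than elements of a finite field, so the sketch shows the arithmetic only and carries no security guarantee.

```python
import random


def split_into_shares(weights, n_shares, rng):
    """Split a weight vector into n_shares random vectors that sum to it."""
    # The first n_shares - 1 shares are random noise (illustrative range).
    shares = [[rng.uniform(-1.0, 1.0) for _ in weights]
              for _ in range(n_shares - 1)]
    # The last share is chosen so that all shares add up to the weights.
    last = [w - sum(s[i] for s in shares) for i, w in enumerate(weights)]
    shares.append(last)
    return shares


def secure_average(cluster_weights, seed=0):
    """Average the hospitals' weight vectors using only sums of shares."""
    rng = random.Random(seed)
    n = len(cluster_weights)
    dim = len(cluster_weights[0])

    # received[j] holds the one share hospital j gets from each hospital.
    received = [[] for _ in range(n)]
    for weights in cluster_weights:
        for j, share in enumerate(split_into_shares(weights, n, rng)):
            received[j].append(share)

    # Each hospital sums what it received and forwards only that sum.
    partial_sums = [[sum(s[i] for s in shares) for i in range(dim)]
                    for shares in received]

    # The server adds the partial sums and divides by the hospital count.
    total = [sum(p[i] for p in partial_sums) for i in range(dim)]
    return [t / n for t in total]


if __name__ == "__main__":
    hospitals = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(secure_average(hospitals))
```

With the three toy weight vectors above, the result is approximately the plain average of the inputs, even though the server in this sketch only ever handles sums of shares.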
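Based on the manifold hypothesis, real-world data often lie on a low-dimensional manifold, while normalizing flows, as likelihood-based generative models, are incapable of finding this manifold due to their structural constraints. An interesting question therefore arises: $\textit{"Can we find sub-manifold(s) of data in normalizing flows and estimate the density of the data on the sub-manifold(s)?"}$. In this paper, we introduce two approaches, namely per-pixel penalized log-likelihood and hierarchical training, to answer this question. We propose a single-step method for joint manifold learning and density estimation by disentangling the transformed space obtained by normalizing flows into manifold and off-manifold parts. This is done with a per-pixel penalized likelihood function for learning a sub-manifold of the data. Normalizing flows assume that the transformed data is Gaussianized, but this imposed assumption is not necessarily true, especially in high dimensions. To tackle this problem, a hierarchical training approach is employed to improve the density estimation on the sub-manifold. The results validate the superiority of the proposed methods in simultaneous manifold learning and density estimation using normalizing flows, in terms of generated image quality and likelihood.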
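We present MapReader, a free, open-source software library written in Python for analyzing large map collections (scanned or born-digital). This library transforms the way historians can use maps by turning extensive, homogeneous map sets into searchable primary sources. MapReader allows users with little or no computer vision expertise to i) retrieve maps via web servers; ii) preprocess and divide them into patches; iii) annotate patches; iv) train, fine-tune, and evaluate deep neural network models; and v) create structured data about map content. We demonstrate how MapReader enables historians to interpret a collection of $\approx$16K nineteenth-century Ordnance Survey map sheets ($\approx$30.5M patches), foregrounding the challenge of translating visual markers into machine-readable data. We present a case study focusing on British rail infrastructure and buildings as depicted in these maps. We also show how the outputs of the MapReader pipeline can be linked to other, external datasets, which we use to evaluate as well as enrich and interpret the results. We release $\approx$62K manually annotated patches used here for training and evaluating the models.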
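Step ii) above, dividing a map sheet into patches, can be sketched in a few lines. This is not MapReader's API: the function name `slice_into_patches`, the nested-list image representation, and the choice to keep smaller patches at the right and bottom edges are all assumptions made for illustration.

```python
def slice_into_patches(image, patch_size):
    """Cut a 2D grid of pixels into square patches.

    `image` is a list of equal-length rows. Returns a list of
    (top, left, patch) tuples; patches at the right and bottom
    edges may be smaller than patch_size.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Take patch_size rows, then patch_size columns from each row.
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append((top, left, patch))
    return patches


if __name__ == "__main__":
    # A toy 4 x 6 "map sheet" whose pixel values encode their position.
    sheet = [[r * 6 + c for c in range(6)] for r in range(4)]
    for top, left, patch in slice_into_patches(sheet, 2):
        print(top, left, patch)
```

Keeping the `(top, left)` offset with each patch is one way to let later annotations or model predictions be mapped back to a position on the original sheet.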
Designing experiments often requires balancing between learning about the true treatment effects and earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can simultaneously attain optimality and computational efficiency goals, and it has recently been used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially distributed rewards. We report its performance in simulated 2-armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel GI-modified design shows operating characteristics that are comparable in learning (e.g. statistical power) but substantially better in earning (e.g. direct benefits). This illustrates the potential of designs that use a GI approach to allocate participants to improve participant benefits, increase efficiencies, and reduce experimental costs in adaptive multi-armed experiments with exponential rewards.
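The contrast between non-adaptive and adaptive allocation with exponentially distributed rewards can be illustrated with a small simulation. This is only a sketch under stated assumptions: the adaptive rule below is a greedy sample-mean rule used as a simple stand-in, not the Gittins Index rule of the paper, and the function name, arm means, and number of participants are hypothetical.

```python
import random


def run_experiment(true_means, n_participants, adaptive, seed=0):
    """Simulate a multi-armed experiment with exponential rewards.

    Non-adaptive: participants are assigned to arms in round-robin order.
    Adaptive (greedy stand-in, NOT the Gittins Index): after one pull per
    arm, each participant goes to the arm with the highest sample mean.
    Returns (participants per arm, total reward earned).
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    totals = [0.0] * k
    for t in range(n_participants):
        if t < k or not adaptive:
            arm = t % k  # round-robin assignment
        else:
            arm = max(range(k), key=lambda a: totals[a] / counts[a])
        # expovariate takes the rate, i.e. 1 / mean of the distribution.
        reward = rng.expovariate(1.0 / true_means[arm])
        counts[arm] += 1
        totals[arm] += reward
    return counts, sum(totals)


if __name__ == "__main__":
    means = [1.0, 2.0]  # hypothetical true mean rewards of the two arms
    print("non-adaptive:", run_experiment(means, 100, adaptive=False))
    print("adaptive:    ", run_experiment(means, 100, adaptive=True))
```

A greedy rule can lock onto an inferior arm after an unlucky early draw, which is one reason index-based policies such as the GI are of interest. Comparing the two designs meaningfully would require averaging over many seeds rather than a single run.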
Quadruped robots are currently used in industrial robotics as mechanical aids to automate several routine tasks. However, the use of such robots in a domestic setting is still largely a research topic. This paper discusses the understanding and virtual simulation of such a robot, capable of detecting and understanding human emotions, generating its gait, and responding via sounds and expressions on a screen. To this end, we use a combination of reinforcement learning and software engineering concepts to simulate a quadruped robot that can understand emotions, navigate through various terrains, detect sound sources, and respond to emotions using audio-visual feedback. This paper aims to establish a framework for simulating a quadruped robot that is emotionally intelligent and can primarily respond to audio-visual stimuli using motor or audio responses. Emotion detection from speech was not as performant as ERANNs or Zeta Policy learning, but still reached an accuracy of 63.5%. The video emotion detection system produced results that are almost on par with the state of the art, with an accuracy of 99.66%. Due to its "on-policy" learning process, the PPO algorithm learned extremely quickly, allowing the simulated dog to demonstrate a remarkably seamless gait across the different cadences and variations. This enabled the quadruped robot to respond to generated stimuli, allowing us to conclude that it functions as predicted and satisfies the aim of this work.
Real-world robotic grasping can be done robustly if complete 3D Point Cloud Data (PCD) of an object is available. However, in practice, PCDs are often incomplete when objects are viewed from few and sparse viewpoints before the grasping action, leading to the generation of wrong or inaccurate grasp poses. We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry from the partial PCD to produce reliable grasp poses. Our proposed PCD completion network is a Transformer-based encoder-decoder network with an Offset-Attention layer. Our network is inherently invariant to the object pose and to point permutation, and it generates PCDs that are geometrically consistent and properly completed. Experiments on a wide range of partial PCDs show that 3DSGrasp outperforms the best state-of-the-art method on PCD completion tasks and substantially improves the grasping success rate in real-world scenarios. The code and dataset will be made available upon acceptance.